This analysis looks at the MDS samples that were treated with homoharringtonine (HHT), a translation inhibitor. There were only 2 samples for each treatment in this arm.

Pre-processing

  1. Adapters were trimmed using cutadapt v1.16
  2. Gene expression was quantified using salmon v1.3.0
  3. TPMs were obtained for the genes using tximport v1.20.0
library(dplyr)
library(ggplot2)
library(DESeq2)
library(tximport)
library(readr)
library(tximportData)
library(readxl)
library(knitr)
library(tidyverse)
library(pheatmap)
library(RColorBrewer)
library(viridis)
library(ggrepel)
library(EnhancedVolcano)
library(fgsea)
library(limma)
library(VennDiagram)
library(UpSetR)
library(wesanderson)
library(kableExtra)
library(reshape)
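Step 3 of the pre-processing collapses transcript-level salmon estimates to genes. In practice this is done by `tximport()` with a transcript-to-gene table; the toy example below only illustrates the idea that gene-level TPM is the sum of the TPMs of that gene's transcripts. All transcript, gene, and sample names are hypothetical, and the numbers are made up.

```r
# Toy transcript-level TPMs (rows = transcripts, columns = samples).
# All names and values here are hypothetical, for illustration only.
tx_tpm <- matrix(
  c(10, 5, 20, 0,    # sampleA
    12, 3, 25, 1),   # sampleB
  nrow = 4,
  dimnames = list(c("tx1", "tx2", "tx3", "tx4"), c("sampleA", "sampleB"))
)

# Transcript-to-gene map, analogous to the tx2gene table passed to tximport.
tx2gene <- data.frame(
  TXNAME = c("tx1", "tx2", "tx3", "tx4"),
  GENEID = c("geneA", "geneA", "geneB", "geneB")
)

# Look up the gene for each transcript row, then sum TPMs within each gene.
gene_ids <- tx2gene$GENEID[match(rownames(tx_tpm), tx2gene$TXNAME)]
gene_tpm <- rowsum(tx_tpm, group = gene_ids)
gene_tpm
```

With real data, the equivalent step would typically be a call such as `tximport(files, type = "salmon", tx2gene = tx2gene)`, where `files` is a named vector of paths to the salmon `quant.sf` outputs.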

Summary of Data Metrics

| Sample | Patient | Reads | %Q30 | Duplication rate (%) | % reads with adapter | STAR aligned reads | % aligned | Splices annotated | Salmon mapping rate (%) |
|---|---|---|---|---|---|---|---|---|---|
| CD123+ | HTB336 | 28161997 | 93 | 40 | 2.7 | 24795059 | 88.04 | 8802685 | 82.8170 |
| CD123+ | HTB61 | 17643949 | 90 | 29 | 2.5 | 13932164 | 78.96 | 3351481 | 68.1966 |
| CD123+ HHT | HTB336 | 17398408 | 93 | 42 | 2.7 | 15452341 | 88.81 | 5925372 | 87.9806 |
| CD123+ HHT | HTB61 | 22970807 | 93 | 33 | 2.6 | 20607839 | 89.71 | 4897461 | 75.8607 |

This table shows relatively high duplication levels in these samples (29 to 42). This likely indicates that the input was low or that too many PCR cycles were performed.
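As a quick consistency check on the metrics table, the snippet below recomputes the percent aligned column from the read counts and summarizes the duplication rates. It assumes that percent aligned was calculated as STAR-aligned reads divided by total reads, which is not stated in the report but reproduces the tabulated values. The column names in the data frame are chosen here for illustration.

```r
# Values copied from the data metrics table above.
metrics <- data.frame(
  sample = c("CD123+", "CD123+", "CD123+ HHT", "CD123+ HHT"),
  patient = c("HTB336", "HTB61", "HTB336", "HTB61"),
  reads = c(28161997, 17643949, 17398408, 22970807),
  duplication_pct = c(40, 29, 42, 33),
  star_aligned = c(24795059, 13932164, 15452341, 20607839)
)

# Assumed definition: percent aligned = STAR-aligned reads / total reads.
metrics$pct_aligned <- round(100 * metrics$star_aligned / metrics$reads, 2)

# Range of duplication rates across the four samples.
dup_range <- range(metrics$duplication_pct)

metrics[, c("sample", "patient", "pct_aligned")]
dup_range
```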

Summary of Data prior to analysis

Metadata Table
| SampleName | patient | cellType | percent rRNA |
|---|---|---|---|
| HTB61_CD123 | HTB61 | 123pos | 0.30 |
| HTB61_CD132_HHT | HTB61 | 123pos_HHT | 0.14 |
| HTB336_CD123_pos | HTB336 | 123pos | 0.38 |
| HTB336_CD123_pos_HHT | HTB336 | 123pos_HHT | 0.17 |

There was little rRNA contamination in these samples. This is expected as these are from polyA selected libraries.

There are 5001 genes with more than 5 counts in every sample.
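The filter described above (more than 5 counts in all samples) can be sketched in base R as follows. The count matrix is a made-up toy example; only the sample names are taken from the metadata table, and the gene names are hypothetical.

```r
# Toy count matrix (genes x samples); values are made up for illustration.
counts <- matrix(
  c(100,  50,  80,  60,   # geneA: kept
      6,   7,   9,  10,   # geneB: kept (all values > 5)
      5,  20,  30,  40,   # geneC: dropped (5 is not > 5)
      0,   0, 200, 300),  # geneD: dropped
  nrow = 4, byrow = TRUE,
  dimnames = list(
    paste0("gene", LETTERS[1:4]),
    c("HTB61_CD123", "HTB61_CD132_HHT",
      "HTB336_CD123_pos", "HTB336_CD123_pos_HHT")
  )
)

# Keep genes with more than 5 counts in every sample.
keep <- rowSums(counts > 5) == ncol(counts)
filtered <- counts[keep, , drop = FALSE]
nrow(filtered)
```

Requiring the threshold in every sample is strict with only four libraries: a gene that is silenced in one condition (like `geneD` here) is removed before testing.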

Sample Heatmap and Correlation matrix

We see higher correlation between samples from the same patient than between samples that received the same treatment. This is expected for patient data.
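A minimal sketch of how a sample correlation matrix of this kind can be computed is shown below. The expression values are hypothetical and were chosen so that samples from the same patient resemble each other; `p1`/`p2` stand for patients and `ctl`/`hht` for treatments. The report does not state which correlation measure was used, so Pearson is an assumption.

```r
# Toy log-scale expression (genes x samples); values are hypothetical.
expr <- cbind(
  p1_ctl = c(1, 5, 9, 2, 7, 4),
  p1_hht = c(1, 5, 8, 2, 7, 5),
  p2_ctl = c(6, 2, 3, 8, 1, 4),
  p2_hht = c(6, 2, 3, 7, 1, 5)
)

# Pearson correlation between every pair of samples (assumed measure).
sample_cor <- cor(expr, method = "pearson")
round(sample_cor, 2)

# Compare same-patient pairs with same-treatment pairs.
same_patient <- c(sample_cor["p1_ctl", "p1_hht"], sample_cor["p2_ctl", "p2_hht"])
same_treatment <- c(sample_cor["p1_ctl", "p2_ctl"], sample_cor["p1_hht", "p2_hht"])
same_patient
same_treatment
```

A matrix like `sample_cor` can be passed directly to `pheatmap()` to draw the heatmap.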

PCA plot

PC1 vs PC2

Samples group more strongly by patient than by sample type. PC1 separates the patients, and the major loadings of this component are IFI30, HBB, MGST1, AHSP, and CREM. PC2 separates the samples by treatment, and the major loadings of this component are FOS, IGHM, PUF60, SLC25A6, and ICAM.
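The way major loadings are read off a PCA can be sketched with `prcomp()`. The toy matrix below is hypothetical and built so that two genes differ by patient and one by treatment; the report does not show how its own PCA was computed, so this is an illustration under those assumptions rather than the original code.

```r
# Toy log-scale expression (genes x samples); values are hypothetical.
expr <- rbind(
  geneA = c(10, 10,  2,  2),   # differs by patient
  geneB = c( 1,  1,  9,  9),   # differs by patient
  geneC = c( 5,  3,  5,  3),   # differs by treatment
  geneD = c( 4,  4,  4,  4)    # constant
)
colnames(expr) <- c("p1_ctl", "p1_hht", "p2_ctl", "p2_hht")

# prcomp() expects samples in rows, so transpose; center but do not scale.
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)

# Proportion of variance explained by each component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Genes with the largest absolute loadings on PC1 and PC2.
top_pc1 <- names(sort(abs(pca$rotation[, "PC1"]), decreasing = TRUE))[1:2]
top_pc2 <- names(sort(abs(pca$rotation[, "PC2"]), decreasing = TRUE))[1]
top_pc1
top_pc2
```

In this toy case the patient-specific genes dominate PC1 and the treatment-specific gene dominates PC2, mirroring the pattern described above.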

Run Differential Expression testing using DESeq2 and Calculate Gene Set Enrichment

Compare 123pos_HHT vs 123pos

Significance thresholds: padj < 0.01 and abs(log2FoldChange) > 0.5
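Applied to a results table, the two thresholds above amount to a simple logical filter. The sketch below uses three rows copied from the results table in this report plus two hypothetical genes (`geneX`, `geneY`) added to show rows that fail one of the cutoffs; the variable names are illustrative.

```r
# Three rows copied from the results table in this report, plus two
# hypothetical genes (geneX, geneY) that fail one threshold each.
res <- data.frame(
  external_gene_name = c("FOS", "TM4SF1", "ANXA2", "geneX", "geneY"),
  padj = c(3.44e-16, 7.62e-11, 9.48e-05, 0.20, 1e-04),
  log2FoldChange = c(-4.45, 4.26, 2.04, -3.00, 0.30)
)

padj_cutoff <- 0.01
lfc_cutoff <- 0.5

# Flag genes passing both thresholds; NA padj values count as not significant.
res$sig <- !is.na(res$padj) &
  res$padj < padj_cutoff &
  abs(res$log2FoldChange) > lfc_cutoff

# Direction of change for the significant genes.
res$direction <- ifelse(!res$sig, "ns",
                        ifelse(res$log2FoldChange > 0, "up", "down"))
table(res$direction)
```

The explicit `is.na()` check matters with DESeq2 output, where `padj` is `NA` for genes removed by independent filtering.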

Genes with padj value of < 0.01

| external_gene_name | padj | pvalue | log2FoldChange |
|---|---|---|---|
| FOS | 3.44e-16 | 2.40e-20 | -4.45 |
| LRPAP1 | 4.77e-14 | 6.65e-18 | -4.62 |
| TM4SF1 | 7.62e-11 | 1.59e-14 | 4.26 |
| OSM | 3.73e-10 | 1.30e-13 | -5.03 |
| HBA2 | 3.73e-10 | 1.26e-13 | -3.61 |
| ICAM1 | 2.69e-09 | 1.13e-12 | -2.96 |
| CD83 | 6.11e-09 | 2.98e-12 | -2.88 |
| HBA1 | 1.86e-07 | 1.04e-10 | -5.37 |
| WASH5P | 5.59e-07 | 3.51e-10 | -5.12 |
| ZNF14 | 5.77e-07 | 4.02e-10 | -2.74 |
| GSDME | 1.52e-06 | 1.16e-09 | -6.11 |
| RPS15 | 3.50e-06 | 3.17e-09 | -2.74 |
| ETV3 | 3.50e-06 | 3.18e-09 | -2.40 |
| SERTAD1 | 3.71e-06 | 3.62e-09 | -3.58 |
| ZNF133 | 5.18e-06 | 5.42e-09 | -5.74 |
| ZNF256 | 1.81e-05 | 2.02e-08 | -4.74 |
| IGLC2 | 1.98e-05 | 2.62e-08 | 6.19 |
| TRAF6 | 1.98e-05 | 2.39e-08 | -2.98 |
| GADD45B | 1.98e-05 | 2.51e-08 | -3.84 |
| ATP2A1 | 2.50e-05 | 3.48e-08 | 4.94 |
| RGS1 | 3.13e-05 | 4.58e-08 | -2.61 |
| SNHG17 | 4.09e-05 | 6.27e-08 | -2.44 |
| UCP2 | 4.69e-05 | 7.53e-08 | -2.56 |
| DPH6 | 5.47e-05 | 9.15e-08 | -3.96 |
| RASGEF1B | 6.50e-05 | 1.18e-07 | -2.09 |
| SIKE1 | 6.50e-05 | 1.14e-07 | -2.65 |
| SLC25A6 | 7.98e-05 | 1.56e-07 | -2.79 |
| ZNF625-ZNF20 | 7.98e-05 | 1.52e-07 | -3.09 |
| ANXA2 | 9.48e-05 | 1.92e-07 | 2.04 |
| CHMP1B-AS1 | 9.51e-05 | 1.99e-07 | -4.70 |
| NT5DC1 | 1.11e-04 | 2.40e-07 | -3.44 |
| PDE4B | 1.16e-04 | 2.68e-07 | -2.13 |
| BSG | 1.16e-04 | 2.67e-07 | -2.67 |
| PHYH | 1.20e-04 | 2.91e-07 | -3.23 |
| ZNF442 | 1.20e-04 | 2.93e-07 | -2.67 |
| CFAP20 | 1.60e-04 | 4.02e-07 | -2.53 |
| MCTP2 | 1.68e-04 | 4.53e-07 | -2.63 |
| C19orf48 | 1.68e-04 | 4.57e-07 | -2.40 |
| PUF60 | 1.68e-04 | 4.37e-07 | -2.85 |
| GSK3B | 1.71e-04 | 4.78e-07 | 3.60 |
| NUDT3 | 1.92e-04 | 5.50e-07 | 2.51 |
| NDUFS7 | 2.47e-04 | 7.24e-07 | -3.52 |
| PRMT9 | 2.69e-04 | 8.07e-07 | -3.32 |
| ATF3 | 2.87e-04 | 8.80e-07 | -2.68 |
| ESS2 | 3.10e-04 | 9.72e-07 | -4.32 |
| NUFIP1 | 3.21e-04 | 1.03e-06 | -4.80 |
| KLF10 | 4.05e-04 | 1.33e-06 | -2.21 |
| PLAUR | 5.10e-04 | 1.71e-06 | -2.09 |
| TNFAIP3 | 5.67e-04 | 1.94e-06 | -2.40 |
| IGHM | 5.75e-04 | 2.05e-06 | -2.91 |
| MUC12 | 5.75e-04 | 2.03e-06 | -5.51 |
| NLRP3 | 6.16e-04 | 2.27e-06 | -2.49 |
| IL1RL1 | 6.16e-04 | 2.28e-06 | -3.85 |
| ABHD10 | 6.52e-04 | 2.46e-06 | -3.98 |
| ZNF140 | 6.59e-04 | 2.53e-06 | -2.21 |
| ARL4A | 6.79e-04 | 2.70e-06 | -1.94 |
| CXCR2 | 6.79e-04 | 2.67e-06 | -9.28 |
| NLE1 | 6.99e-04 | 2.83e-06 | -4.05 |
| NCKAP1 | 1.05e-03 | 4.34e-06 | 3.59 |
| PPP1R15A | 1.08e-03 | 4.51e-06 | -2.76 |
| LTBP1 | 1.13e-03 | 4.80e-06 | -7.56 |
| NANS | 1.14e-03 | 4.92e-06 | -2.19 |
| WDR43 | 1.26e-03 | 5.53e-06 | -1.99 |
| CDKN1A | 1.27e-03 | 5.68e-06 | -2.04 |
| NFU1 | 1.27e-03 | 5.76e-06 | -3.16 |
| CIDEB | 1.27e-03 | 5.90e-06 | 6.16 |
| RPL8 | 1.27e-03 | 5.94e-06 | -1.77 |
| HAUS7 | 1.30e-03 | 6.24e-06 | -3.36 |
| HEMK1 | 1.30e-03 | 6.26e-06 | -2.35 |
| FFAR3 | 1.45e-03 | 7.18e-06 | -8.78 |
| ZNF394 | 1.45e-03 | 7.10e-06 | -2.08 |
| BTBD8 | 1.53e-03 | 7.70e-06 | -8.97 |
| MTX2 | 1.54e-03 | 7.83e-06 | -4.46 |
| GRAP | 1.55e-03 | 7.98e-06 | -9.19 |
| IFFO1 | 1.72e-03 | 8.98e-06 | -8.07 |
| TNFAIP6 | 1.72e-03 | 9.11e-06 | -2.49 |
| PLA2G12A | 1.75e-03 | 9.41e-06 | -3.71 |
| CLEC4E | 1.80e-03 | 9.80e-06 | -5.24 |
| RNF138 | 1.85e-03 | 1.02e-05 | -2.14 |
| PTRHD1 | 1.85e-03 | 1.03e-05 | -3.55 |
| HNRNPDL | 1.87e-03 | 1.06e-05 | -1.87 |
| RALY | 1.92e-03 | 1.10e-05 | 3.82 |
| IVD | 1.95e-03 | 1.13e-05 | -2.50 |
| R3HCC1 | 2.18e-03 | 1.28e-05 | -3.98 |
| PMAIP1 | 2.91e-03 | 1.73e-05 | -2.40 |
| H1-4 | 2.91e-03 | 1.75e-05 | -3.24 |
| FOSB | 3.00e-03 | 1.82e-05 | -2.03 |
| WDFY2 | 3.06e-03 | 1.90e-05 | 2.33 |
| ARID4A | 3.06e-03 | 1.88e-05 | -1.90 |
| NKTR | 3.07e-03 | 1.93e-05 | -1.91 |
| EIF1 | 3.46e-03 | 2.19e-05 | -1.97 |
| ASH2L | 3.59e-03 | 2.30e-05 | -2.17 |
| USP3 | 3.71e-03 | 2.41e-05 | -2.25 |
| SNX17 | 4.02e-03 | 2.64e-05 | -2.79 |
| CYB5R4 | 4.22e-03 | 2.80e-05 | -2.17 |
| POLR2E | 4.28e-03 | 2.89e-05 | -3.21 |
| BTG2 | 4.28e-03 | 2.89e-05 | -2.40 |
| RNF144B | 4.38e-03 | 3.02e-05 | -2.40 |
| LMAN2 | 4.38e-03 | 3.00e-05 | -2.78 |
| CXCR5 | 4.38e-03 | 3.08e-05 | -9.00 |
| IRAK2 | 4.38e-03 | 3.08e-05 | -2.81 |
| H2AC19 | 4.41e-03 | 3.14e-05 | -5.21 |
| USP11 | 4.52e-03 | 3.25e-05 | -2.49 |
| GFI1B | 4.53e-03 | 3.32e-05 | -2.86 |
| ACSF3 | 4.53e-03 | 3.30e-05 | -3.43 |
| DOCK7 | 4.54e-03 | 3.40e-05 | 3.45 |
| SLC7A6OS | 4.54e-03 | 3.36e-05 | -3.84 |
| HSPA5 | 4.54e-03 | 3.42e-05 | -1.82 |
| JDP2 | 5.26e-03 | 4.00e-05 | 4.12 |
|  | 5.35e-03 | 4.10e-05 | -8.86 |
| ZBTB48 | 5.53e-03 | 4.28e-05 | -2.76 |
| RGS2 | 6.13e-03 | 4.83e-05 | -2.06 |
|  | 6.13e-03 | 4.82e-05 | 4.33 |
| ZNF136 | 6.19e-03 | 4.92e-05 | -1.96 |
| SAMSN1 | 6.33e-03 | 5.12e-05 | -1.74 |
| ZFP36 | 6.33e-03 | 5.16e-05 | -2.29 |
| SPPL2A | 6.33e-03 | 5.09e-05 | 2.38 |
| SRSF10 | 6.50e-03 | 5.34e-05 | -1.67 |
| POC5 | 6.79e-03 | 5.68e-05 | -3.00 |
| SNAPC1 | 6.79e-03 | 5.68e-05 | -2.08 |
| HLA-E | 7.10e-03 | 5.99e-05 | -1.62 |
| CDC37L1 | 7.17e-03 | 6.10e-05 | -2.62 |
| SDE2 | 7.40e-03 | 6.35e-05 | -1.76 |
| MBNL3 | 7.42e-03 | 6.42e-05 | -4.55 |
| DNAJB1 | 8.24e-03 | 7.18e-05 | -1.82 |
| FLRT2 | 8.45e-03 | 7.42e-05 | 3.48 |
| MIR222HG | 8.68e-03 | 7.72e-05 | 2.52 |
| CISH | 8.68e-03 | 7.75e-05 | -3.55 |
| HIF1AN | 9.24e-03 | 8.31e-05 | -2.14 |
| UGGT2 | 9.36e-03 | 8.49e-05 | -4.59 |
| DDA1 | 9.43e-03 | 8.61e-05 | -3.39 |
| LIPN | 1.00e-02 | 9.20e-05 | -8.60 |

Table showing the 132 genes with a significant BH-adjusted p-value (< 0.01).
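For readers unfamiliar with the BH (Benjamini-Hochberg) adjustment behind the padj column, the base R function `p.adjust()` implements it. The p-values below are hypothetical. Note that DESeq2 applies BH after its independent filtering step by default, so the padj values in the table above would not necessarily match a plain `p.adjust()` over every gene.

```r
# Hypothetical raw p-values for five genes.
pvalue <- c(geneA = 0.001, geneB = 0.008, geneC = 0.039,
            geneD = 0.041, geneE = 0.60)

# Benjamini-Hochberg adjustment: each sorted p-value is scaled by n / rank,
# then made monotone from the largest p-value downwards.
padj <- p.adjust(pvalue, method = "BH")
padj

# Number of genes passing the 0.01 cutoff used in this report.
n_sig <- sum(padj < 0.01)
n_sig
```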

Top 20 DEG plots

Volcano Plot

MA plots

This PCA plot shows the first 2 principal components following normalization with DESeq2. Again, PC1 separates the samples by patient, with top gene loadings of KYNU, MTCO1P12, SLAMF7, CCL7, and PPBP. PC2 separates the samples by treatment, with top gene loadings of LRPAP1, FOS, TM4SF1, HBA1/2, and WASH5P.

Displaying a table of ordered pathways
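Gene set enrichment with fgsea starts from a named, ranked vector of gene-level statistics. The report does not state which statistic was used for ranking, so the sketch below assumes log2FoldChange; the rows are copied from the results table in this report. Genes without a symbol are dropped first, since the results table contains rows with a blank gene name.

```r
# Rows copied from the results table in this report.
res <- data.frame(
  external_gene_name = c("FOS", "TM4SF1", "ICAM1", "ANXA2", "GSK3B"),
  log2FoldChange = c(-4.45, 4.26, -2.96, 2.04, 3.60)
)

# Drop genes without a symbol or fold change, then keep one row per symbol.
has_name <- !is.na(res$external_gene_name) & res$external_gene_name != ""
res <- res[has_name & !is.na(res$log2FoldChange), ]
res <- res[!duplicated(res$external_gene_name), ]

# Named numeric vector sorted from most up- to most down-regulated.
# Ranking by log2FoldChange is an assumption, not the report's stated method.
ranks <- setNames(res$log2FoldChange, res$external_gene_name)
ranks <- sort(ranks, decreasing = TRUE)
names(ranks)
```

A vector like `ranks` could then be supplied as the `stats` argument of `fgsea::fgsea()`, together with a named list of gene sets as `pathways`.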

Plot the waterfall results

All Pathways In Moustache Plot

Repeating with KEGG pathways

Top Up and Down Ranked Pathways hallmark

Top Up and Down Ranked Pathways kegg

Top ranked pathways hallmark

I can produce this plot for any hallmark pathway of interest; just let me know.

Top ranked pathways kegg

I can produce this plot for any KEGG pathway of interest; just let me know.

## Session Information

sessionInfo()
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur 10.16
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
##  [1] grid      parallel  stats4    stats     graphics  grDevices utils    
##  [8] datasets  methods   base     
## 
## other attached packages:
##  [1] reshape_0.8.8               kableExtra_1.3.4.9000      
##  [3] wesanderson_0.3.6           UpSetR_1.4.0               
##  [5] VennDiagram_1.6.20          futile.logger_1.4.3        
##  [7] limma_3.48.3                fgsea_1.18.0               
##  [9] EnhancedVolcano_1.10.0      ggrepel_0.9.1              
## [11] viridis_0.6.1               viridisLite_0.4.0          
## [13] RColorBrewer_1.1-2          pheatmap_1.0.12            
## [15] forcats_0.5.1               stringr_1.4.0              
## [17] purrr_0.3.4                 tidyr_1.1.3                
## [19] tibble_3.1.4                tidyverse_1.3.1            
## [21] knitr_1.34                  readxl_1.3.1               
## [23] tximportData_1.20.0         readr_2.0.1                
## [25] tximport_1.20.0             DESeq2_1.32.0              
## [27] SummarizedExperiment_1.22.0 Biobase_2.52.0             
## [29] MatrixGenerics_1.4.3        matrixStats_0.60.1         
## [31] GenomicRanges_1.44.0        GenomeInfoDb_1.28.2        
## [33] IRanges_2.26.0              S4Vectors_0.30.0           
## [35] BiocGenerics_0.38.0         ggplot2_3.3.5              
## [37] dplyr_1.0.7                
## 
## loaded via a namespace (and not attached):
##   [1] backports_1.2.1        fastmatch_1.1-3        BiocFileCache_2.0.0   
##   [4] systemfonts_1.0.2      plyr_1.8.6             splines_4.1.0         
##   [7] crosstalk_1.1.1        BiocParallel_1.26.2    digest_0.6.27         
##  [10] htmltools_0.5.2        fansi_0.5.0            magrittr_2.0.1        
##  [13] memoise_2.0.0          tzdb_0.1.2             Biostrings_2.60.2     
##  [16] annotate_1.70.0        modelr_0.1.8           extrafont_0.17        
##  [19] vroom_1.5.4            extrafontdb_1.0        svglite_2.0.0         
##  [22] prettyunits_1.1.1      colorspace_2.0-2       rappdirs_0.3.3        
##  [25] blob_1.2.2             rvest_1.0.1            haven_2.4.3           
##  [28] xfun_0.26              crayon_1.4.1           RCurl_1.98-1.4        
##  [31] jsonlite_1.7.2         genefilter_1.74.0      survival_3.2-13       
##  [34] glue_1.4.2             gtable_0.3.0           zlibbioc_1.38.0       
##  [37] XVector_0.32.0         webshot_0.5.2          DelayedArray_0.18.0   
##  [40] proj4_1.0-10.1         Rttf2pt1_1.3.9         maps_3.3.0            
##  [43] scales_1.1.1           futile.options_1.0.1   DBI_1.1.1             
##  [46] Rcpp_1.0.7             progress_1.2.2         xtable_1.8-4          
##  [49] bit_4.0.4              DT_0.18                htmlwidgets_1.5.3     
##  [52] httr_1.4.2             ellipsis_0.3.2         farver_2.1.0          
##  [55] pkgconfig_2.0.3        XML_3.99-0.7           sass_0.4.0            
##  [58] dbplyr_2.1.1           locfit_1.5-9.4         utf8_1.2.2            
##  [61] labeling_0.4.2         tidyselect_1.1.1       rlang_0.4.11          
##  [64] AnnotationDbi_1.54.1   munsell_0.5.0          cellranger_1.1.0      
##  [67] tools_4.1.0            cachem_1.0.6           cli_3.0.1             
##  [70] generics_0.1.0         RSQLite_2.2.8          broom_0.7.9           
##  [73] evaluate_0.14          fastmap_1.1.0          yaml_2.2.1            
##  [76] bit64_4.0.5            fs_1.5.0               KEGGREST_1.32.0       
##  [79] ash_1.0-15             formatR_1.11           ggrastr_0.2.3         
##  [82] xml2_1.3.2             biomaRt_2.48.3         compiler_4.1.0        
##  [85] rstudioapi_0.13        filelock_1.0.2         curl_4.3.2            
##  [88] beeswarm_0.4.0         png_0.1-7              reprex_2.0.1          
##  [91] geneplotter_1.70.0     bslib_0.2.5.1          stringi_1.7.4         
##  [94] highr_0.9              ggalt_0.4.0            lattice_0.20-44       
##  [97] Matrix_1.3-4           vctrs_0.3.8            pillar_1.6.2          
## [100] lifecycle_1.0.0        jquerylib_0.1.4        data.table_1.14.0     
## [103] bitops_1.0-7           R6_2.5.1               KernSmooth_2.23-20    
## [106] gridExtra_2.3          vipor_0.4.5            lambda.r_1.2.4        
## [109] MASS_7.3-54            assertthat_0.2.1       withr_2.4.2           
## [112] GenomeInfoDbData_1.2.6 hms_1.1.0              rmarkdown_2.11        
## [115] lubridate_1.7.10       ggbeeswarm_0.6.0